(54) Tumour suppressor gene

(57) A detailed genetic map of human chromosome 11 was prepared. Then, a
commonly deleted region on the chromosome in the tumor tissues of patients
with multiple endocrine neoplasia type 1 was identified. Further, by linkage
analysis of a family line with this disease, a gene causative of this disease
was localized. A gene present in the region common to these observations was
cloned and the structure of this gene was determined. Because the protein
coded for by this DNA is homologous with transcriptional factors, it is
expected that the above-mentioned gene may be a novel tumor suppressor gene.
Further, it is also expected that the above-mentioned gene and the protein
coded for thereby may be useful in the preparation of a remedy for cancer and
a diagnostic drug for cancer.
Description

Background of the Invention
Field of the Invention

Description of the Related Art

It has been known for a long time that gene mutation in cells plays an important role in the onset of cancer. Recent
advances in genetic engineering have made it possible to amplify specific DNAs and to [...] have been the focus of
intense research interest in these several years. [...] [Lane, D.P. et al., Nature, 278, 261 (1979)] and APC gene
[Kenneth, W. et al., ... (1991)] of colon cancer and WT1 gene of Wilms' tumor [Call, K.M. et al., Cell, 60, 509 (1990)]
[...]. However, it is considered that there are many more unidentified tumor suppressor genes.

Multiple endocrine neoplasia type 1 (MEN1) is an autosomal dominant hereditary disease characterized by the [...]






Disclosure of the Invention 
Summary of the Invention 






The present inventors isolated cosmid clones containing a number of RFLP markers on chromosome 11 [...] early
growth response protein 2 (EGR2) was isolated.

An organism responds to various exogenous and endogenous stimuli by comprehensively utilizing [...]
cDNA in the full length is an intranuclear transcriptional regulator having a nuclear localizing signal, a proline-rich
domain and a zinc finger motif.

Thus, the present invention relates to: 

(1) a DNA comprising the full structure or a part of the DNA represented by SEQ ID NO:1 ; 

(2) a polypeptide comprising the full structure or a part of the polypeptide coded for by the DNA represented by
SEQ ID NO:1;

(3) a transformant transformed by a plasmid having, integrated therein, the full structure or a part of the DNA rep- 
resented by SEQ ID NO:1 which can be expressed therein; 

(4) an antibody against the above-mentioned polypeptide as an antigen; and 

(5) a gene analysis method which comprises using, as a primer, a probe or a marker, a DNA comprising a part of 
the DNA represented by SEQ ID NO:1 and hybridizing the primer, the probe or the marker with a DNA to be tested. 

In other words, the present invention relates to: 

(a) a cDNA which comprises one containing the full or a part of the cDNA of the tumor suppressor gene repre- 
sented by SEQ ID NO: 1; 

(b) a polypeptide which comprises one containing the full or a part of the polypeptide coded for by the cDNA of the 
tumor suppressor gene represented by SEQ ID NO:1 ; 

(c) host cells which are obtained by integrating the full or a part of the cDNA described in SEQ ID NO:1 into a plas- 
mid which can express it and transforming thereby, 

(d) an antibody against the polypeptide described in the above item (b) as an antigen; and 

(e) a gene analysis method characterized by using a DNA containing a part of the DNA sequence described in the 
above item (a) as a primer, a probe or a marker. 

With respect to the DNAs and polypeptides, those which are substantially equivalent to the DNAs and polypeptides 
described above are also included in the scope of the present invention. The expression "DNAs and polypeptides being 
substantially equivalent" means those which have been modified via, for example, deletion, replacement, addition or 
insertion of the constituting bases or constituting amino acids and derivatives thereof, which exhibit the same effects as 
those of the original DNAs or polypeptides. However, the extent of these effects is irrelevant thereto. The term "a part 
of the DNA" means a fragment composed of at least 10 bases derived from the DNA. In order to employ as a primer, 
for example, a DNA fragment having a base sequence generally consisting of 10 to 30 bases, preferably 15 to 25 bases, 
is selected. In order to employ as a probe, a DNA fragment having a base sequence generally consisting of at least 15
bases, preferably at least 20 bases, is selected.
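
The length ranges given above can be summarised in a short check. The following is only an illustrative sketch: the function name `oligo_uses` and the dictionary keys are assumptions introduced here and do not appear in the specification; only the numeric thresholds are taken from the text.

```python
def oligo_uses(length: int) -> dict:
    """Compare an oligonucleotide length (in bases) with the ranges stated in the text."""
    return {
        # "a part of the DNA": a fragment of at least 10 bases
        "is_part_of_dna": length >= 10,
        # primers: generally 10 to 30 bases
        "primer_general": 10 <= length <= 30,
        # primers: preferably 15 to 25 bases
        "primer_preferred": 15 <= length <= 25,
        # probes: generally at least 15 bases
        "probe_general": length >= 15,
        # probes: preferably at least 20 bases
        "probe_preferred": length >= 20,
    }


if __name__ == "__main__":
    for n in (8, 12, 20, 40):
        print(n, oligo_uses(n))
```

For example, a 20-base fragment falls within both the preferred primer range and the preferred probe range, whereas a 12-base fragment qualifies only as a primer of the general range.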

The term "a part of the polypeptide" means a peptide having a sequence composed of at least 6 amino acid resi- 
dues derived from the polypeptide. When a part of a polypeptide is to be used as an antigen for the preparation of an 
antibody or as an epitope for the detection of an antibody, it is known that a peptide having a sequence consisting of 6 
amino acid residues would bind to an antibody [see WO 8403564, published on Sep. 13, 1984 (Assignee: COMMON- 
WEALTH SERUM LABS and GEYSEN, H. M.)]. A peptide having a sequence generally consisting of at least 10 amino 
acid residues, preferably at least 20 amino acid residues, is employed therefor. Although it may be anticipated that a 
peptide having a sequence consisting of 6 amino acid residues can achieve only a poor efficiency in the production of 
an antibody, such a peptide is also usable in the form of a fused peptide. 

Furthermore, an RNA which comprises one translated from the DNA represented by SEQ ID NO:1 or a part of the 
same and RNAs which are substantially equivalent thereto are included in the scope of the present invention. 

Now the present invention will be described in greater detail. 
Detailed Description of the Invention
(1) Isolation of cDNA 
[...] chromosome 11 can be prepared in, for example, the following manner.

[...] containing a single human chromosome 11 in a mouse genomic background, a [...] [Sambrook, J. et al.,
Molecular Cloning, 2nd ed., Cold Spring Harbor Laboratory Press, NY (1989)]. Clones having an insert originating in
the human chromosome can be screened by [...] hybridization with the use of a whole human DNA as probe. The thus
obtained cosmid clones containing a DNA originating in the human chromosome 11 are then subjected to the
fluorescent in situ hybridization [...]. Thus, each of the multitude of the cosmid clones can be located [...]
chromosome 11 map can be prepared. Further, RFLP markers [...] fragment length pattern which has been prepared by
cleaving human DNA with several restriction enzymes [...]. Among these clones, those located [...] DNA of a cancer
tissue of a patient is examined in the loss of heterozygosity (LOH) [...] the region where the tumor suppressor gene is
located can be further restricted.

From the cosmid clones existing in the region which has been thus restricted, a DNA fragment being under
expression can be isolated by the exon trapping method [Buckler, A. et al., Proc. Natl. Acad. Sci. USA, 88, 4005 - 4009
(1991)] [...].

(2) Confirmation of the whole structure of the gene 

The base sequence of the cDNA can be determined by the Maxam-Gilbert method [Maxam, A.M. and Gilbert, W.,
Proc. Natl. Acad. Sci. USA, 74, 560 (1977)] or the dideoxy technique [Messing, J., Nucleic Acids Res., 9, 309 (1981)].
It can be confirmed by, for example, the 5'-RACE method, the 3'-RACE method or the Northern blotting that the
cDNA obtained by the above-mentioned method contains the full length protein translation region.

(3) Recombinant expression vectors and transformants transformed thereby 

The tumor suppressor gene cDNA obtained by the above-mentioned method, or a fragment thereof, is integrated
[...] culturing the transformant in a conventional manner, a large amount of the tumor suppressor gene product or a
fragment [...] the expression of the cDNA by a known method with the use of restriction enzymes and DNA [...], a
recombinant expression vector can be constructed. Examples of the vectors usable therefor include plasmids pBR322
and pUC18 originating in Escherichia coli, [...] originating in yeast, phage vectors λgt10 and λgt11, and vector SV40
originating in [...]. [...] the host cells may be used therefor without restriction, [...] and a suitable combination may
be selected therefor depending on [...] host employed in the expression of a DNA sequence coding for the tumor
suppressor gene, or a fragment thereof. Any DNA may be used as the cDNA herein so long as it codes for the tumor
suppressor gene product [...]. A chemically synthesized one may be used therefor. When the protein to be expressed is
one having a physiological activity of suppressing the proliferation of cancer cells, then the DNA is not restricted to
the DNA sequence represented by SEQ ID NO:1 [...] undergone partial substitution, deletion, insertion or a
combination thereof may be used.

The recombinant expression vector thus obtained is introduced into a host by, for example, the competent cell
method [J. Mol. Biol., ..., 154 (1970)], the protoplast method [Proc. Natl. Acad. Sci. USA, ...], the [...] phosphate
method [Science, 221, 551 (1983)], the in vitro packaging method [Proc. Natl. Acad. Sci. USA, ...] [...] yeasts and
animal cells are usable as the host. The transformant thus obtained is then cultured in an appropriate medium selected
depending on the employed host. The culture is usually effected at a temperature [...] separated and purified from the
culture by appropriately combining known separation/isolation methods. Examples of these methods include salting
out, solvent precipitation, dialysis, gel filtration, electrophoresis, ion exchange chromatography, affinity
chromatography and reversed phase high performance liquid chromatography.

(4) Preparation of antibody

By using the tumor suppressor gene product or a fragment thereof as an antigen, an antibody is prepared. A
polyclonal antibody is prepared in accordance with a conventional method by, for example, sufficiently immunizing
an animal such as a mouse, guinea pig or rabbit with the antigen by administering it subcutaneously, intramuscularly,
intraperitoneally or intravenously a number of times, sampling the blood from the animal and then separating the
serum to obtain the antibody. A commercially available adjuvant is also usable therefor.

A monoclonal antibody can be prepared by a known method. For example, spleen cells of a mouse immunized with
the antigen described above are fused with commercially available mouse myeloma cells to thereby give hybridomas.
Then the target monoclonal antibody can be prepared from the culture supernatant of the hybridoma or the ascites fluid 
of a mouse inoculated with the hybridoma. 

It is not necessary that the tumor suppressor gene product to be used as the antigen has the whole amino acid
structure; a peptide having a partial structure thereof, a modified peptide, its derivative or a fused peptide formed by
fusing this peptide with another peptide is also usable. These substances may be prepared by either biological
techniques or chemical synthesis techniques.

These antibodies enable the identification and determination of the peptide of the present invention in human bio- 
logical samples and thus are applicable to, for example, diagnostic drugs for diseases to which the polypeptide is 
related. The peptide can be immunologically assayed in accordance with any of the known methods including the fluo- 
rescent antibody method, the passive agglutination method and the enzyme-labeled antibody technique. 

(5) Gene analysis of human organic tissues 

Examples of the biological sample usable in the gene analysis include normal human tissues, various types of 
human tumor tissues, human blood, human bodily fluids and human secretions. The DNA of the employed tissue may 
be extracted and prepared by, for example, the method reported by Sato, T., et al. [Cancer Res., 50, 7184 (1990)].

From the DNA sequence provided by the present invention, a partial DNA sequence at an appropriate position is
selected and a synthetic oligonucleotide having this sequence or one complementary thereto is used as a primer, a
probe or a marker. Thus the occurrence of a mutation of this gene in humans and the nature of the mutation can be
analyzed. Furthermore, alterations (insertion, deletion, etc.) of this gene in a sample can also be detected by these
analyses.

The partial DNA sequence may be selected from any part of the DNA sequence of the above-mentioned gene.
Needless to say, an artificially modified DNA sequence may be used therefor and thus the corresponding gene
mutation can be detected.

The analysis may be effected by, for example, the following method. Namely, primers of two sequences are 
selected and the partial sequence between them is amplified by the PCR method. Then the amplified DNA sequence 
is directly analyzed. Alternatively, this amplification product is integrated into a plasmid in the same manner as that of 
the above-mentioned case and host cells are transformed thereby. After culturing the transformant thus obtained, the 
DNA sequence of the clone thus obtained is analyzed. Further, the ligase chain reaction method may be applied to the
amplification [Wu et al., Genomics, 4, 560 - 569 (1989)]. Furthermore, a specific mutation in the above-mentioned gene
in a sample can be detected by using the allele-specific PCR [Ruano and Kidd, Nucleic Acid Research, 17, 8392
(1989)] or the ARMS method [C.R. Newton et al., Nucleic Acid Research, 17, 2503 - 2517 (1989)].
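
The step of selecting two primers and amplifying the partial sequence between them can be pictured with a small in-silico sketch. It is offered only as an illustration under stated assumptions: the helper names `reverse_complement` and `amplicon` are introduced here, exact matching stands in for primer annealing, the template is the first 60 bases of SEQ ID NO:1 as printed in this specification, and the two 15-base primers are hypothetical choices made for the example, not primers disclosed by the inventors.

```python
# Translation table giving the complementary base for each DNA base.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon(template: str, forward: str, reverse: str):
    """Return the template region spanned by the two primers (primers included).

    Exact string matching is used as a stand-in for annealing; None is returned
    if either primer site is not found on the template.
    """
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its site on the
    # given strand is the reverse complement of the primer.
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# First 60 bases of SEQ ID NO:1 as printed in the sequence listing.
TEMPLATE = (
    "CGTTGCTGTC" "GAAATGAAGT" "GCGCGCTGCG"
    "ACACCTCCCA" "GCCCACCGAA" "CTCCGCCGCC"
)
# Hypothetical 15-base primers chosen only for illustration.
FORWARD = TEMPLATE[:15]
REVERSE = reverse_complement(TEMPLATE[-15:])

if __name__ == "__main__":
    product = amplicon(TEMPLATE, FORWARD, REVERSE)
    print(len(product), product)
```

The amplified region obtained in this way is what would then be sequenced directly or cloned into a plasmid, as described above.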

Similarly, a point mutation can be detected by the SSCP method [Orita et al., Proc. Natl. Acad. Sci. USA, 86, 2766
- 2770 (1989); and Genomics, 5, 874 - 879 (1989)] or the RNase-protection method with the use of probes containing
the DNA sequence thus selected or an RNA sequence originating therein. By using these probes, a mutation in the 
above-mentioned gene in a sample can be detected by the Southern hybridization method or an abnormality in the 
expression level of this gene in a sample can be examined by the Northern hybridization method. 

Escherichia coli DH5aF/pAB1, pFL2 and pCE9, each carrying a plasmid containing the cDNA of this tumor
suppressor gene, were deposited with the National Institute of Bioscience and Human-Technology, Agency of
Industrial Science and Technology, Ministry of International Trade and Industry under accession numbers FERM
P-14127, 14128 and 14129, respectively, on February 8, 1994, and they were changed to international deposition
under accession numbers FERM BP-4923, 4924 and 4925, respectively, on December 9, 1994.

The DNA of the present invention has a structure homologous with those of transcriptional factors, and originates 
in the most restricted commonly deleted region on chromosome 11 in MEN1-associated tumors. Therefore, it is
expected that the DNA of the present invention may be a novel tumor suppressor gene. The DNA may be used as a tool 
in a gene therapy. Further, the fragment of the DNA may be used in the gene analysis of the DNA and in the diagnosis 
of diseases to which the DNA relates. 
The polypeptide coded for by the DNA according to the present invention may be used as a reagent for
investigation [...] antibody. The antibody may be used in the qualitative or quantitative [...] of the
polypeptide in a biological sample. Thus, it is expected that the antibody may be useful as a [...].

Brief Description of the Drawings

Fig. 1 [...] gene exists by the linkage analysis and [...].

Fig. 2 is a diagram showing [...] clones [...] domain structure of ZFM1 cDNA.

Fig. 3 is a diagram showing the homology of the ZFM1 protein with WT1 or EGR2.

Fig. 4 is a diagram showing the constitution of exons of the ZFM1 gene. The exons are represented by 1 to 14. The
domains observed in cDNA are represented by A to H.

Examples

[...] the following Examples will be given.

However, it is to be understood that the present invention is not restricted to these Examples only.

Example 1 Isolation and linkage analysis of cosmid clones specific for chromosome 11

At the early stage of studies, it was reported based on the linkage with a PYGM (muscle glycogen phosphorylase)
[...]. Subsequently, it was [...] that it existed in a region of 12 cM located between D11S149 marker and INT2 marker
of 11q13 [Nakamura et al., Am. J. Hum. Genet., 44, 751 - 755 (1989)]. [...] clones containing a part of the human
chromosomal DNA with the use of a whole human DNA as probe [... et al., Am. J. Hum. Genet., 48, 258 - 268 (1991);
and Tanigami et al., Am. J. Hum. Genet., 50, 56 - 64 (1992)] [...] (1992)] were located on the chromosome through the
fluorescent in situ hybridization (FISH) method [Hori et al., Genomics, 13, 129 - 133 (1992)]. By effecting the linkage
analysis with the use of the cosmid markers whereby RFLP could be detected, the location of [...]
Example 2 Preparation of deletion map of chromosome 11 in MEN1-associated tumors

On the other hand, investigations on the loss of heterozygosity (LOH) of the chromosome 11 in MEN1-associated
[...] suggested that the tumor suppressor gene exists in the above-mentioned region [... N. Engl. J. Med., 321, 213 -
218 (1989); Thakker et al., N. Engl. J. Med., 321, 218 - 224 (1989); and [...] Res., 51, 1154 - 1157 (1991)]. It has been
further pointed out by the mapping [...] [..., 1968 - 1972 (1990)]. The results of the examination on LOH are arranged
together with the results of the linkage [...] it is considered that the MEN1 gene exists in a region [...].



Example 3 Preparation of physical map of 11q13 region

[...] cleaved human genomic DNA with 8 restriction enzymes each having a rare breakage point. After [...] gel
electrophoresis [...] blotting [...] cosmid clones existing in 11q13 as probes, the relationship in locations among the
cosmid clones was clarified depending upon the capability of each clone of being hybridized with a common genomic
DNA fragment. As a result, it has been found out that cCI11-4, cCI11-367, cCI11-364, cCI11-247, [...] hybridized with
genomic DNA fragments relating to one another and thus [...] located in a [...].

[...] cCI11-4, among these cosmid clones, are markers closest to the MEN1 gene (lod values: 5.03 and 5.13)
[Fujimori et al., Am. J. Hum. Genet., 50, 399 - 403 (1992)]. Based on the results of the mapping [...] isolated by PYGM
[...] 367, among the 6 cosmid clones as described above, is also close to PYGM.






EPO 727486 A2 



Example 4 Isolation of exon sequence from 11q13 region

As described above, cCI11-4 and cCI11-367 are the cosmid clones which are closest to the MEN1 gene. Thus, an
attempt was made to isolate exons from these 2 cosmid clones by the exon trapping method [Buckler, A., et al., Proc.
Natl. Acad. Sci. USA, 88, 4005 - 4009 (1991)]. The cosmid DNA was cleaved with BglII or BamHI, or both of these
enzymes, and the fragment thus obtained was linked to the BamHI site of an exon splicing vector pSPL1. Transfection
into COS-7 cells and isolation of exon sequences by the reverse transcription PCR (RT-PCR) were each effected in
accordance with the procedure described in the original paper. Consequently, 3 exon sequences originating in
cCI11-367 were obtained and named S367E1, S367E2 and S367E4, respectively. These exon sequences were 147 bp,
76 bp and 129 bp in size, respectively.

Example 5 Isolation of full-length cDNA

By using S367E4 (i.e., one of the exon sequences obtained in the above Example 4) as probe, a human cortical
cDNA library was screened. Thus, a clone AB1 carrying a cDNA insert of 1 kb was obtained. With the use of this clone
AB1 as a probe, further, a cDNA clone FL2 was obtained from a human fetal liver cDNA library while cDNA clones CE5,
CE9 and CE16 were obtained from a human cerebellar cDNA library. Then, it was confirmed that each of these clones
could be hybridized with the original cosmid clone cCI11-367 and mapped on the chromosome 11q13 with a hybrid cell
line panel. A sequence constructed by overlapping these cDNA clones with one another at the common parts
corresponded to ZFM1 cDNA of 3200 bp (SEQ ID NO:1). This ZFM1 cDNA contained an open reading frame (ORF) of
1869 bp which corresponded to a sequence of base Nos. 383 to 2251 in SEQ ID NO:1. Based on the information as will
be described hereinbelow, it has been proved that the sequence of SEQ ID NO:1 and that of each clone can be regarded
as being composed of 6 domains A (base Nos. 1 to 413 in SEQ ID NO:1), B (base Nos. 414 to 542 in SEQ ID NO:1), C
(base Nos. 543 to 618 in SEQ ID NO:1), D (base Nos. 619 to 1964 in SEQ ID NO:1), E (base Nos. 1965 to 2218 in SEQ
ID NO:1) and F (base Nos. 2219 to 3200 in SEQ ID NO:1) and domains G and H which are completely different
therefrom. Namely, the exon sequences S367E2 and S367E4 obtained in the above Example 4 corresponded
respectively to the domains C and B. The cDNA clone CE5 lacked the domain E consisting of 254 base pairs
corresponding to a sequence of base Nos. 1965 to 2218 in SEQ ID NO:1, which may be due to alternative splicing. The
cDNA clone AB1 contained the domains A and B and the different one G but not the domains C, D, E and F. The cDNA
clone CE16 consisted of the domains D and E and the different one H (see Fig. 2).
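
The base numbers quoted in this Example can be cross-checked arithmetically. The sketch below is merely illustrative: the names `DOMAINS`, `ORF`, `span_length`, `domain_lengths` and `tiles_sequence` are assumptions introduced here, while the coordinates are those stated above (1-based, inclusive).

```python
# Domain coordinates (1-based, inclusive) as stated for SEQ ID NO:1.
DOMAINS = {
    "A": (1, 413),
    "B": (414, 542),
    "C": (543, 618),
    "D": (619, 1964),
    "E": (1965, 2218),
    "F": (2219, 3200),
}
ORF = (383, 2251)      # open reading frame, base Nos. 383 to 2251
TOTAL_LENGTH = 3200    # length of the ZFM1 cDNA in bp


def span_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def domain_lengths() -> dict:
    """Length in bp of each of the domains A to F."""
    return {name: span_length(*coords) for name, coords in DOMAINS.items()}


def tiles_sequence() -> bool:
    """True if domains A to F are contiguous and cover bases 1..TOTAL_LENGTH."""
    expected_start = 1
    for start, end in DOMAINS.values():
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == TOTAL_LENGTH + 1


if __name__ == "__main__":
    print(domain_lengths())
    print(tiles_sequence())
    print(span_length(*ORF), span_length(*ORF) // 3)
```

Under these coordinates domains B, C and E come to 129 bp, 76 bp and 254 bp, in agreement with the sizes given for S367E4, S367E2 and the segment missing from CE5, and the 1869 bp ORF corresponds to 623 codons.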

Example 6 Characteristics of the structure of protein coded for by the tumor suppressor gene

A protein coded for by ZFM1 cDNA consisted of 623 amino acid residues and had a nuclear localizing signal
containing basic amino acids in the N-terminal side. Further, a sequence C-X2-C-X4-H-X4-C (amino acid Nos. 279 - 292)
had characteristics of a zinc finger motif existing in a DNA binding protein. 118 proline residues were contained in this
ZFM1 protein. In particular, 69 proline residues were contained in a region of amino acid Nos. 420 to 623 thereof. The
sequence of this region showed high homologies with Wilms' tumor suppressor gene product (WT1) [Gessler et al.,
Nature, 343, 774 - 778 (1990)] and early growth response 2 (EGR2) protein as a transcriptional factor (27.3% and
24.0%, respectively) (see Fig. 3). WT1 is a transcription factor having a Kruppel-like zinc finger motif [Rosenberg et al.,
Nature, 319, 336 - 339 (1986)]. EGR2 is a human homologue of an early growth response gene Krox-20 [Chavrier et al.,
EMBO J., 7, 29 - 35 (1988)] which is expressed at the G0-G1 junction in the cell cycle of cultured mouse cells and it is
also a transcriptional factor [Joseph et al., Proc. Natl. Acad. Sci. USA, 85, 7164 - 7168 (1988)]. The ZFM1 protein
further had 7 proline repetitive sequences (each consisting of at least 4 proline residues located continuously) in the
C-terminal side. One of these repetitive sequences followed a glutamine repetitive sequence and thus formed a structure
which was almost the same as that of the hinge domain of a mineralocorticoid receptor [Arriza et al., Science, 237, 268
- 275 (1987)]. Such a hinge structure is essentially required in the communication between a hormone binding domain
and a DNA binding domain [Krust et al., EMBO J., 5, 891 - 897 (1986); and Giguere et al., Cell, 46, 645 - 652 (1986)].
Further, mRNAs of a number of types originating in the ZFM1 gene were expressed in hormone-producing organs such
as pancreas, thyroid, adrenal gland and ovarium (see Table 1 in Example 8).

These facts indicate that the ZFM1 protein is a tumor suppressor gene product which is localized in the nuclei and
exerts its function by binding to DNA and thus suppressing the proliferation of cells, and that ZFM1 is a gene which
participates in the onset of MEN1.
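
The motif notation C-X2-C-X4-H-X4-C used in this Example can be read as a pattern in which X stands for any residue, giving a span of 14 residues, which agrees with the stated range of amino acid Nos. 279 to 292. The following sketch shows one way such a pattern might be searched for in a one-letter protein sequence. It is an illustration only: the names `ZINC_FINGER_PATTERN` and `find_motifs` are assumptions, and the peptide in the example is an arbitrary toy string, not part of the ZFM1 sequence.

```python
import re

# C-X2-C-X4-H-X4-C written as a regular expression over one-letter codes,
# where "." stands for X (any amino acid residue).
ZINC_FINGER_PATTERN = re.compile(r"C.{2}C.{4}H.{4}C")


def find_motifs(protein: str) -> list:
    """Return (first_residue_no, last_residue_no, matched_text) for each match.

    Residue numbers are 1-based and inclusive, as in the specification.
    """
    return [
        (m.start() + 1, m.end(), m.group())
        for m in ZINC_FINGER_PATTERN.finditer(protein)
    ]


if __name__ == "__main__":
    toy_peptide = "AAACDECFGHIHKLMNCPP"  # hypothetical sequence for illustration
    print(find_motifs(toy_peptide))
```

Note that `finditer` reports non-overlapping matches only, which is sufficient for locating a single motif of this kind.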

Example 7 Structure of genomic gene

Based on the cosmid clone containing the ZFM1 gene, the genomic structure of the ZFM1 gene was determined.
The ZFM1 gene existed over a region of about 20 kb in the genomic DNA and consisted of 14 exons (see Fig. 4). As
Fig. 4 shows, it has been revealed that these exons (Nos. 1 to 14) and the domains A to H described in the above
Example [...] 13 and a part of exon 14, domain F = a part of exon 14, domain G = exon 2a and domain H = exon 3a.

[...] 2a and 3a [...] sequence of the exon 14. On the other hand, the domain [...] consisting of domains A-B-G is
coded for by the exon 2a which directly follows the exon 2 [...] of the domains [...] the exon 3a which is located
immediately before the exon 4 coding for the domain D.
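
The relation between exons and domains can also be worked out from coordinates alone, using the exon locations given in the sequence listing for SEQ ID NO:1 and the domain base numbers given in Example 5. The sketch below is illustrative only: the names `EXONS`, `DOMAINS`, `overlaps` and `exons_for_domain` are assumptions introduced here, the coordinates are taken as printed, and domains G and H (exons 2a and 3a) are omitted because they are not part of SEQ ID NO:1.

```python
# Exon locations (1-based, inclusive) as printed in the sequence listing.
EXONS = {
    1: (1, 413), 2: (414, 542), 3: (543, 618), 4: (619, 771),
    5: (772, 861), 6: (862, 1045), 7: (1046, 1161), 8: (1162, 1269),
    9: (1270, 1450), 10: (1451, 1724), 11: (1725, 1784),
    12: (1785, 1964), 13: (1965, 2137), 14: (2138, 3132),
}
# Domain coordinates as stated in Example 5.
DOMAINS = {
    "A": (1, 413), "B": (414, 542), "C": (543, 618),
    "D": (619, 1964), "E": (1965, 2218), "F": (2219, 3200),
}


def overlaps(a: tuple, b: tuple) -> bool:
    """True if two inclusive intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def exons_for_domain(name: str) -> list:
    """Exon numbers whose printed location overlaps the given domain."""
    return [n for n, span in EXONS.items() if overlaps(span, DOMAINS[name])]


if __name__ == "__main__":
    for name in DOMAINS:
        print(name, exons_for_domain(name))
```

With these coordinates, domain E spans exon 13 and the first part of exon 14, and domain F lies within exon 14, which is consistent with the legible part of the correspondence stated above.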

Example 8 Expression of ZFM1 gene in human tissues

It is considered that the larger mRNA [...] in all of these tissues. [...] corresponds to one [...] (see Fig. 2). To
[...] the expression of the ZFM1 gene in greater detail, the reverse transcription PCR [...] extracting RNAs from [...]
(arrowheads in Fig. 2) specific for the respective domains. As a result, the expressions of ZFM1 mRNAs [...] due to
differences in splicing, were [...] A-[...]-D, A-B-[...] and H-D (see Fig. [...]) [...] expression levels differed from one
another. In contrast, the expression of a mRNA having the domain E was restricted to heart, pancreas, thyroid and
ovarium (see Table 1).
SEQUENCE LISTING

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3200 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) FEATURE KEY: 5'UTR
(B) LOCATION: 1..382
(A) FEATURE KEY: CDS
(B) LOCATION: 383..2254
(A) FEATURE KEY: exon 1
(B) LOCATION: 1..413
(A) FEATURE KEY: exon 2
(B) LOCATION: 414..542
(A) FEATURE KEY: exon 3
(B) LOCATION: 543..618
(A) FEATURE KEY: exon 4
(B) LOCATION: 619..771
(A) FEATURE KEY: exon 5
(B) LOCATION: 772..861
(A) FEATURE KEY: exon 6
(B) LOCATION: 862..1045
(A) FEATURE KEY: exon 7
(B) LOCATION: 1046..1161
(A) FEATURE KEY: exon 8
(B) LOCATION: 1162..1269
(A) FEATURE KEY: exon 9
(B) LOCATION: 1270..1450
(A) FEATURE KEY: exon 10
(B) LOCATION: 1451..1724
(A) FEATURE KEY: exon 11
(B) LOCATION: 1725..1784
(A) FEATURE KEY: exon 12
(B) LOCATION: 1785..1964
(A) FEATURE KEY: exon 13
(B) LOCATION: 1965..2137
(A) FEATURE KEY: exon 14
(B) LOCATION: 2138..3132
(A) FEATURE KEY: 3'UTR
(B) LOCATION: 2280..3200

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CGTTGCTGTC GAAATGAAGT GCGCGCTGCG ACACCTCCCA GCCCACCGAA CTCCGCCGCC 60
ATTTCCTCGC TTGCCTAACG GTTCGGCCAA TCCCAGCGCG CATCAATGCC GGACTGAGGC 120
TCCGCCAATC GGAGGCCGCC GATTTCGACC CTTCGCCTCG GCCCGGCCCA ATCCATTCCC 180
CGGCCCCGCC GCCCCCGGCC CGCCCCCGCG GTGCCCTCTC TCCTCCCTCT TTGTGCGTCT 240
CGCGCCGCCG CCGCCCGCCG CGTGAGAGGA CGGGCTCCGC GCGCTCCGGC AGCGCATTCG 300
GGTCCCCTCC CCCCGGGAGG CTTGCGAAGG AGAAGCCGCC GCAGAGGAAA AGCAGGTGCC 360
GGTGCCTGTC CCCGGGGGCG CC ATG GCG ACC GGA GCG AAC GCC ACG CCG TTG 412
                         Met Ala Thr Gly Ala Asn Ala Thr Pro Leu
                         1               5                   10

GAC TTC CCA AGT AAG AAG CGG AAG AGG AGC CGC TGG AAC CAA GAC ACA 460
Asp Phe Pro Ser Lys Lys Arg Lys Arg Ser Arg Trp Asn Gln Asp Thr
                15                  20                  25

ATG GAA CAG CCG ACA GTG ATT CCA GGA ATG CCT ACA GTT ATT CCC CCT 508
Met Glu Gln Pro Thr Val Ile Pro Gly Met Pro Thr Val Ile Pro Pro
            30                  35                  40

GGA CTT ACT CGA GAA CAA GAA AGA GCT TAT ATA GTG CAA CTG CAG ATA 556
Gly Leu Thr Arg Glu Gln Glu Arg Ala Tyr Ile Val Gln Leu Gln Ile
        45                  50                  55

GAA GAC CTG ACT CGT AAA CTG CGC ACA GGG GAC CTG GGC ATC CCC CCT 604
Glu Asp Leu Thr Arg Lys Leu Arg Thr Gly Asp Leu Gly Ile Pro Pro
    60                  65                  70

AAC CCT GAG GAC AGG TCC CCT TCC CCT GAG CCC ATC TAC AAT AGC GAG 652
Asn Pro Glu Asp Arg Ser Pro Ser Pro Glu Pro Ile Tyr Asn Ser Glu
75                  80                  85                  90

GGG AAG CGG CTT AAC ACC CGA GAG TTC CGC ACC CGC AAA AAG CTG GAA 700
Gly Lys Arg Leu Asn Thr Arg Glu Phe Arg Thr Arg Lys Lys Leu Glu
                95                  100                 105

GAG GAG CGG CAC AAC CTC ATC ACA GAG ATG GTT GCA CTC AAT CCG GAT 748
Glu Glu Arg His Asn Leu Ile Thr Glu Met Val Ala Leu Asn Pro Asp
            110                 115                 120

TTC AAG CCA CCT GCA GAT TAC AAA CCT CCA GCA ACA CGT GTG AGT GAT 796
Phe Lys Pro Pro Ala Asp Tyr Lys Pro Pro Ala Thr Arg Val Ser Asp
        125                 130                 135

AAA GTC ATG ATT CCA CAA GAT GAG TAC CCA GAA ATC AAC TTT GTG GGG 844
Lys Val Met Ile Pro Gln Asp Glu Tyr Pro Glu Ile Asn Phe Val Gly
    140                 145                 150

CTG CTC ATC GGG CCC AGA GGG AAC ACC CTG AAG AAC ATA GAG AAG GAG 892
Leu Leu Ile Gly Pro Arg Gly Asn Thr Leu Lys Asn Ile Glu Lys Glu
155                 160                 165                 170

TGC AAT GCC AAG ATT ATG ATC CGG GGG AAA GGG TCT GTG AAA GAA GGG 940
Cys Asn Ala Lys Ile Met Ile Arg Gly Lys Gly Ser Val Lys Glu Gly
                175                 180                 185

AAG GTT GGG CGC AAA GAT GGC CAG ATG TTG CCA GGA GAA GAT GAG CCA 988
Lys Val Gly Arg Lys Asp Gly Gln Met Leu Pro Gly Glu Asp Glu Pro
            190                 195                 200

CTT CAT GCC CTG GTT ACT GCC AAT ACA ATG GAG AAC GTC AAA AAG GCA 1036
Leu His Ala Leu Val Thr Ala Asn Thr Met Glu Asn Val Lys Lys Ala
        205                 210                 215

GTG GAA CAG ATA AGA AAC ATC CTG AAG CAG GGT ATC GAG ACT CCA GAG 1084
Val Glu Gln Ile Arg Asn Ile Leu Lys Gln Gly Ile Glu Thr Pro Glu
    220                 225                 230

GAC CAG AAT GAT CTA CGG AAG ATG CAG CTT CGG GAG TTG GCT CGC TTA 1132
Asp Gln Asn Asp Leu Arg Lys Met Gln Leu Arg Glu Leu Ala Arg Leu
235                 240                 245                 250

AAT GGG ACC CTT CGG GAA GAC GAT AAC AGG ATC TTA AGA CCC TGG CAG 1180
Asn Gly Thr Leu Arg Glu Asp Asp Asn Arg Ile Leu Arg Pro Trp Gln
                255                 260                 265

AGC TCA GGG ACC CGC AGC ATT ACC AAC ACC ACA GTG TGT ACC AAG TGT 1228
Ser Ser Gly Thr Arg Ser Ile Thr Asn Thr Thr Val Cys Thr Lys Cys
            270                 275                 280

GGA GGG GCT GGC CAC ATT GCT TCA GAC TGT AAA TTC CAA AGG CCT GGT 1276
Gly Gly Ala Gly His Ile Ala Ser Asp Cys Lys Phe Gln Arg Pro Gly
        285                 290                 295

GAT CCT CAG TCA GCT CAG GAT AAA GCA CGG ATG GAT AAA GAA TAT TTG 1324
Asp Pro Gln Ser Ala Gln Asp Lys Ala Arg Met Asp Lys Glu Tyr Leu
    300                 305                 310

TCC CTC ATG GCT GAA CTG GGT GAA GCA CCT GTC CCA GCA TCT GTG GGC 1372
Ser Leu Met Ala Glu Leu Gly Glu Ala Pro Val Pro Ala Ser Val Gly
315                 320                 325                 330

TCC ACC TCT GGG CCT GCC ACC ACA CCC CTG GCC AGC GCA CCT CGT CCT 1420
Ser Thr Ser Gly Pro Ala Thr Thr Pro Leu Ala Ser Ala Pro Arg Pro
                335                 340                 345

GCT GCT CCC GCC AAC AAC CCA CCT CCA CCG TCT CTC ATG TCT ACC ACC 1468
Ala Ala Pro Ala Asn Asn Pro Pro Pro Pro Ser Leu Met Ser Thr Thr
            350                 355                 360


CAG AGC CGC CCA CCC TGG ATG AAT TCT GGT CCT TCA GAG AGT TGG CCC 1516
Gln Ser Arg Pro Pro Trp Met Asn Ser Gly Pro Ser Glu Ser Trp Pro
        365                 370                 375

TAC CAC GGC ATG CAT GGA GGT GGT CCT GGT GGG CCC GGA GGT GGC CCC 1564
Tyr His Gly Met His Gly Gly Gly Pro Gly Gly Pro Gly Gly Gly Pro
    380                 385                 390

CAC AGC TTC CCA CAC CCA TTA CCC AGC CTG ACA GGT GGG CAT GGT GGA 1612
His Ser Phe Pro His Pro Leu Pro Ser Leu Thr Gly Gly His Gly Gly
395                 400                 405                 410

CAT CCC ATG CAG CAC AAC CCC AAT GGA CCC CCA CCC CCT TGG ATG CAG 1660
His Pro Met Gln His Asn Pro Asn Gly Pro Pro Pro Pro Trp Met Gln
                415                 420                 425

CCA CCA CCA CCA CCG ATG AAC CAG GGC CCC CAC CCT CCT GGG CAC CAT 1708
Pro Pro Pro Pro Pro Met Asn Gln Gly Pro His Pro Pro Gly His His
            430                 435                 440

GGC CCT CCT CCA ATG GAT CAG TAC CTG GGA AGT ACG CCT GTG GGC TCT 1756
Gly Pro Pro Pro Met Asp Gln Tyr Leu Gly Ser Thr Pro Val Gly Ser
        445                 450                 455

GGG GTC TAT CGC CTG CAT CAA GGA AAA GGT ATG ATG CCG CCA CCA CCT 1804
Gly Val Tyr Arg Leu His Gln Gly Lys Gly Met Met Pro Pro Pro Pro
    460                 465                 470

ATG GGC ATG ATG CCG CCG CCG CCG CCG CCT CCC AGT GGG CAG CCC CCA 1852
Met Gly Met Met Pro Pro Pro Pro Pro Pro Pro Ser Gly Gln Pro Pro
475                 480                 485                 490

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w 



15 



25 



30 



40 



45 



SO 



55 



CCC CCT CCC TCT GGT CCT CTT CCC CCA TGG CAA CAA CAG CAG CAG CAG 1900
Pro Pro Pro Ser Gly Pro Leu Pro Pro Trp Gln Gln Gln Gln Gln Gln
                495                 500                 505

CCT CCG CCA CCC CCT CCG CCC AGC AGC AGT ATG GCT TCC AGT ACC CCC 1948
Pro Pro Pro Pro Pro Pro Pro Ser Ser Ser Met Ala Ser Ser Thr Pro
            510                 515                 520

TTG CCA TGG CAG CAA AAT ACG ACG ACT ACC ACC ACG AGC GCT GGC ACA 1996
Leu Pro Trp Gln Gln Asn Thr Thr Thr Thr Thr Thr Ser Ala Gly Thr
        525                 530                 535

GGG TCC ATC CCG CCA TGG CAA CAG CAG CAG GCG GCT GCC GCA GCT TCT 2044
Gly Ser Ile Pro Pro Trp Gln Gln Gln Gln Ala Ala Ala Ala Ala Ser
    540                 545                 550

CCA GGA GCC CCT CAG ATG CAA GGC AAC CCC ACT ATG GTG CCC CTG CCC 2092
Pro Gly Ala Pro Gln Met Gln Gly Asn Pro Thr Met Val Pro Leu Pro
555                 560                 565                 570

CCC GGG GTC CAG CCG CCT CTG CCG CCT GGG GCC CCT CCC CCT CCG CCC 2140
Pro Gly Val Gln Pro Pro Leu Pro Pro Gly Ala Pro Pro Pro Pro Pro
                575                 580                 585

CGT AGC ATC GAG TGT CTT CTT TGT CTT CTT TCT CTC CTC ACC CAA CTC 2188
Arg Ser Ile Glu Cys Leu Leu Cys Leu Leu Ser Leu Leu Thr Gln Leu
            590                 595                 600

CCT TTG CCT CTC CCC AAA CCG GGC CGC CAG GAT CCC TCC CCG CGG CGG 2236
Pro Leu Pro Leu Pro Lys Pro Gly Arg Gln Asp Pro Ser Pro Arg Arg
        605                 610                 615

CGA TGG CCC GAG CCA TGAGAGTGAG GACTTTCCGC GCCCATTGGT GACCCTTCCA 2291
Arg Trp Pro Glu Pro
    620         623

GGCAGACAGC CTCAGCAACG CCCCTGGTGG ACAGGATGGT TCGGCAAAGC AGCCTGAGTT 2351
ATTTTTGTGG ACGGAATCGG AACACGCTGG CTCCATATCG TGAAATTTTT ATTAATTTTT 2411
TTCTTTTTCC TTTGTTACTT CTTTATCTTT TCCTTTCTTC AGACTCCGTC CAAGGAGATG 2471
CTCTCCCCGG TCTTCTGCTG CAATTTAGAT TCCTTTGGGT TCTCTCCAGT TCTCCTTCCC 2531
TTACCAAGGA GAGGGGAGCA AATGGTTTTG GGCAAGGGCT TTGGCCATTC ATGTCAAGCT 2591
GGTTGTGGGT TTTTCAAGGT GCCATAGCCA CCCCCAAATA TGTTTGTTTA AAGCGTGGGG 2651
TTTTTTAATC TCTGCCACCC TTGTCAAGGG AGTCTTGTAA AGTTGCCGAG GGTAGGTTCA 2711
TCTCCAGGTT TCGGGATTCC CATCCGTCCT GGCGATCCTG CCAGCAGTGG GTGGGCAGCC 2771
TGAGCTCCCT CGGGCTCGCC TGCCAGCCTG GAGTTCTTCC TGTGCTCCTT GATCACCTGA 2831
GCTGCCTCAG ATTCCATTTG GTCCTCTCCT TCCTGGAAGG CTTCCTTTTA TGTTTTGTTT 2891
TAATCCCAAA TGTCTGAATG TTTTGCAGTG TGTAGGGGTT TGAGCCCCTT GTTCATTCTC 2951
CTTCCTTTTT CCTCCCGCTT CCCTCTCCAT GAAGTGATTC TGTTGACAAT AATGTATACT 3011
GCGCGTTCTC TTCACTGGTT TATCTGCAGA AATTTCTCTG GGCTTTTTTC GGTGTTAGAT 3071
TCAACACTGC GCTAAAGCGG GGATGTTCCA TTGAATAAAA GAGCAGTGTG GTTTTCTGGG 3131
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3191
AAAAAAAAA 3200



SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
    (A) NAME: Cancer Institute
    (B) STREET: 37-1, Kamiikebukuro 1-chome,
    (C) CITY: Toshima-ku, Tokyo
    (E) COUNTRY: Japan
    (F) POSTAL CODE (ZIP): none

    (A) NAME: Eisai Co., Ltd.
    (B) STREET: 6-10, Koishikawa 4-chome,
    (C) CITY: Bunkyo-ku, Tokyo
    (E) COUNTRY: JAPAN
    (F) POSTAL CODE (ZIP): 112

(ii) TITLE OF INVENTION: Tumor Suppressor Gene

(iii) NUMBER OF SEQUENCES: 2

(iv) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(v) CURRENT APPLICATION DATA:
    APPLICATION NUMBER: EP 95101980.1

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 3200 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
    (A) NAME/KEY: 5'UTR
    (B) LOCATION: 1..382

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 383..2254

(ix) FEATURE:
    (A) NAME/KEY: exon 1
    (B) LOCATION: 1..413

(ix) FEATURE:
    (A) NAME/KEY: exon 2
    (B) LOCATION: 414..542

(ix) FEATURE:
    (A) NAME/KEY: exon 3
    (B) LOCATION: 543..618

(ix) FEATURE:
    (A) NAME/KEY: exon 4
    (B) LOCATION: 619..771

(ix) FEATURE:
    (A) NAME/KEY: exon 5
    (B) LOCATION: 772..861

(ix) FEATURE:
    (A) NAME/KEY: exon 6
    (B) LOCATION: 862..1045

(ix) FEATURE:
    (A) NAME/KEY: exon 7
    (B) LOCATION: 1046..1161

(ix) FEATURE:
    (A) NAME/KEY: exon 8
    (B) LOCATION: 1162..1269

(ix) FEATURE:
    (A) NAME/KEY: exon 9
    (B) LOCATION: 1270..1450

(ix) FEATURE:
    (A) NAME/KEY: exon 10
    (B) LOCATION: 1451..1724

(ix) FEATURE:
    (A) NAME/KEY: exon 11
    (B) LOCATION: 1725..1784

(ix) FEATURE:
    (A) NAME/KEY: exon 12
    (B) LOCATION: 1785..1964

(ix) FEATURE:
    (A) NAME/KEY: exon 13
    (B) LOCATION: 1965..2137

(ix) FEATURE:
    (A) NAME/KEY: exon 14
    (B) LOCATION: 2138..3132

(ix) FEATURE:
    (A) NAME/KEY: 3'UTR
    (B) LOCATION: 2280..3200
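The feature coordinates above can be cross-checked with a little arithmetic. The following Python sketch is illustrative only and is not part of the filed listing. The names `FEATURES`, `EXONS` and `span_length` are hypothetical, the locations are assumed to be 1-based and inclusive, and the CDS location is assumed to include the stop codon.

```python
# Coordinates transcribed from the feature table of SEQ ID NO:1
# (assumed 1-based, inclusive). All names here are illustrative only.
FEATURES = {"5'UTR": (1, 382), "CDS": (383, 2254), "3'UTR": (2280, 3200)}

EXONS = [
    (1, 413), (414, 542), (543, 618), (619, 771), (772, 861),
    (862, 1045), (1046, 1161), (1162, 1269), (1270, 1450),
    (1451, 1724), (1725, 1784), (1785, 1964), (1965, 2137),
    (2138, 3132),
]


def span_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


cds_length = span_length(*FEATURES["CDS"])  # nucleotides in the CDS
codon_count = cds_length // 3               # assumed to include the stop codon
residue_count = codon_count - 1             # residues encoded before the stop

# Each exon should begin one base after the previous exon ends.
exons_contiguous = all(
    nxt[0] == prev[1] + 1 for prev, nxt in zip(EXONS, EXONS[1:])
)

print(cds_length, codon_count, residue_count, exons_contiguous)
```

Under these assumptions the script prints `1872 624 623 True`, which agrees with the 623-amino-acid length stated for SEQ ID NO:2.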



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 



CGTTGCTGTC GAAATGAAGT GCGCGCTGCG ACACCTCCCA GCCCACCGAA CTCCGCCGCC 60
ATTTCCTCGC TTGCCTAACG GTTCGGCCAA TCCCAGCGCG CATCAATGCC GGACTGAGGC 120
TCCGCCAATC GGAGGCCGCC GATTTCGACC CTTCGCCTCG GCCCGGCCCA ATCCATTCCC 180
CGGCCCCGCC GCCCCCGGCC CGCCCCCGCG GTGCCCTCTC TCCTCCCTCT TTGTGCGTCT 240
CGCGCCGCCG CCGCCCGCCG CGTGAGAGGA CGGGCTCCGC GCGCTCCGGC AGCGCATTCG 300
GGTCCCCTCC CCCCGGGAGG CTTGCGAAGG AGAAGCCGCC GCAGAGGAAA AGCAGGTGCC 360

GGTGCCTGTC CCCGGGGGCG CC ATG GCG ACC GGA GCG AAC GCC ACG CCG TTG 412
                         Met Ala Thr Gly Ala Asn Ala Thr Pro Leu
                         1               5                   10

GAC TTC CCA AGT AAG AAG CGG AAG AGG AGC CGC TGG AAC CAA GAC ACA 460
Asp Phe Pro Ser Lys Lys Arg Lys Arg Ser Arg Trp Asn Gln Asp Thr
                15                  20                  25



??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? 508
Met Glu Gln Pro Thr Val Ile Pro Gly Met Pro Thr Val Ile Pro Pro
            30                  35                  40

??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ??? 556
Gly Leu Thr Arg Glu Gln Glu Arg Ala Tyr Ile Val Gln Leu Gln Ile
        45                  50                  55

GAA GAC CTG ACT CGT AAA CTG CGC ACA GGG GAC CTG GGC ATC CCC ??? 604
Glu Asp Leu Thr Arg Lys Leu Arg Thr Gly Asp Leu Gly Ile Pro Pro
    60                  65                  70

AAC CCT GAG GAC AGG TCC CCT TCC CCT GAG CCC ATC TAC AAT AGC GA? 652
Asn Pro Glu Asp Arg Ser Pro Ser Pro Glu Pro Ile Tyr Asn Ser Glu
75                  80                  85                  90

GGG AAG CGG CTT AAC ACC CGA GAG TTC CGC ACC CGC AAA AAG CTG GAA 700
Gly Lys Arg Leu Asn Thr Arg Glu Phe Arg Thr Arg Lys Lys Leu Glu
                95                  100                 105

GAG GAG CGG CAC AAC CTC ATC ACA GAG ATG GTT GCA CTC AAT CCG GAT 748
Glu Glu Arg His Asn Leu Ile Thr Glu Met Val Ala Leu Asn Pro Asp
            110                 115                 120

TTC AAG CCA CCT GCA GAT TAC AAA CCT CCA GCA ACA CGT GTG AGT GAT 796
Phe Lys Pro Pro Ala Asp Tyr Lys Pro Pro Ala Thr Arg Val Ser Asp
        125                 130                 135

??? ??? ??? ATT CCA CAA GAT GAG TAC CCA GAA ATC AAC TTT GTG GGG 844
Lys Val Met Ile Pro Gln Asp Glu Tyr Pro Glu Ile Asn Phe Val Gly
    140                 145                 150

??? ??? ??? ??? ??? ??? ??? ??? ACC CTG ??? ??? ATA GAG AAG GAG 892
Leu Leu Ile Gly Pro Arg Gly Asn Thr Leu Lys Asn Ile Glu Lys Glu
155                 160                 165                 170

TGC AAT GCC AAG ATT ATG ATC CGG GGG AAA GGG TCT GTG AAA GAA GGG 940
Cys Asn Ala Lys Ile Met Ile Arg Gly Lys Gly Ser Val Lys Glu Gly
                175                 180                 185

??? ??? ??? ??? ??? ??? GGC ??? ATG TTG CCA GGA GAA GAT GAG CCA 988
Lys Val Gly Arg Lys Asp Gly Gln Met Leu Pro Gly Glu Asp Glu Pro
            190                 195                 200

??? ??? ??? ??? ??? ACT GCC AAT ACA ATG GAG AAC GTC AAA AAG GCA 1036
Leu His Ala Leu Val Thr Ala Asn Thr Met Glu Asn Val Lys Lys Ala
        205                 210                 215

??? ??? ??? ??? ??? ??? ??? CTG ??? ??? GGT ATC ??? ACT CCA GAG 1084
Val Glu Gln Ile Arg Asn Ile Leu Lys Gln Gly Ile Glu Thr Pro Glu
    220                 225                 230

GAC CAG AAT GAT CTA CGG AAG ATG CAG CTT CGG GAG TTG GCT CGC TTA 1132
Asp Gln Asn Asp Leu Arg Lys Met Gln Leu Arg Glu Leu Ala Arg Leu
235                 240                 245                 250

AAT GGG ACC CTT CGG GAA GAC GAT AAC AGG ATC TTA AGA CCC TGG CAG 1180
Asn Gly Thr Leu Arg Glu Asp Asp Asn Arg Ile Leu Arg Pro Trp Gln
                255                 260                 265

AGC TCA GGG ACC CGC AGC ATT ACC AAC ACC ACA GTG TGT ACC AAG TGT 1228
Ser Ser Gly Thr Arg Ser Ile Thr Asn Thr Thr Val Cys Thr Lys Cys
            270                 275                 280

GGA GGG GCT GGC CAC ATT GCT TCA GAC TGT AAA TTC CAA AGG CCT GGT 1276
Gly Gly Ala Gly His Ile Ala Ser Asp Cys Lys Phe Gln Arg Pro Gly
        285                 290                 295

GAT CCT CAG TCA GCT CAG GAT AAA GCA CGG ATG GAT AAA GAA TAT TTG 1324
Asp Pro Gln Ser Ala Gln Asp Lys Ala Arg Met Asp Lys Glu Tyr Leu
    300                 305                 310

TCC CTC ATG GCT GAA CTG GGT GAA GCA CCT GTC CCA GCA TCT GTG GGC 1372
Ser Leu Met Ala Glu Leu Gly Glu Ala Pro Val Pro Ala Ser Val Gly
315                 320                 325                 330

TCC ACC TCT GGG CCT GCC ACC ACA CCC CTG GCC AGC GCA CCT CGT CCT 1420
Ser Thr Ser Gly Pro Ala Thr Thr Pro Leu Ala Ser Ala Pro Arg Pro
                335                 340                 345

GCT GCT CCC GCC AAC AAC CCA CCT CCA CCG TCT CTC ATG TCT ACC ACC 1468
Ala Ala Pro Ala Asn Asn Pro Pro Pro Pro Ser Leu Met Ser Thr Thr
            350                 355                 360

CAG AGC CGC CCA CCC TGG ATG AAT TCT GGT CCT TCA GAG AGT TGG CCC 1516
Gln Ser Arg Pro Pro Trp Met Asn Ser Gly Pro Ser Glu Ser Trp Pro
        365                 370                 375

TAC CAC GGC ATG CAT GGA GGT GGT CCT GGT GGG CCC GGA GGT GGC CCC 1564
Tyr His Gly Met His Gly Gly Gly Pro Gly Gly Pro Gly Gly Gly Pro
    380                 385                 390

CAC AGC TTC CCA CAC CCA TTA CCC AGC CTG ACA GGT GGG CAT GGT GGA 1612
His Ser Phe Pro His Pro Leu Pro Ser Leu Thr Gly Gly His Gly Gly
395                 400                 405                 410

CAT CCC ATG CAG CAC AAC CCC AAT GGA CCC CCA CCC CCT TGG ATG CAG 1660
His Pro Met Gln His Asn Pro Asn Gly Pro Pro Pro Pro Trp Met Gln
                415                 420                 425

CCA CCA CCA CCA CCG ATG AAC CAG GGC CCC CAC CCT CCT GGG CAC CAT 1708
Pro Pro Pro Pro Pro Met Asn Gln Gly Pro His Pro Pro Gly His His
            430                 435                 440

GGC CCT CCT CCA ATG GAT CAG TAC CTG GGA AGT ACG CCT GTG GGC TCT 1756
Gly Pro Pro Pro Met Asp Gln Tyr Leu Gly Ser Thr Pro Val Gly Ser
        445                 450                 455

GGG GTC TAT CGC CTG CAT CAA GGA AAA GGT ATG ATG CCG CCA CCA CCT 1804
Gly Val Tyr Arg Leu His Gln Gly Lys Gly Met Met Pro Pro Pro Pro
    460                 465                 470

ATG GGC ATG ATG CCG CCG CCG CCG CCG CCT CCC AGT GGG CAG CCC CCA 1852
Met Gly Met Met Pro Pro Pro Pro Pro Pro Pro Ser Gly Gln Pro Pro
475                 480                 485                 490

CCC CCT CCC TCT GGT CCT CTT CCC CCA TGG CAA CAA CAG CAG CAG CAG 1900
Pro Pro Pro Ser Gly Pro Leu Pro Pro Trp Gln Gln Gln Gln Gln Gln
                495                 500                 505

CCT CCG CCA CCC CCT CCG CCC AGC AGC AGT ATG GCT TCC AGT ACC CCC 1948
Pro Pro Pro Pro Pro Pro Pro Ser Ser Ser Met Ala Ser Ser Thr Pro
            510                 515                 520

TTG CCA TGG CAG CAA AAT ACG ACG ACT ACC ACC ACG AGC GCT GGC ACA 1996
Leu Pro Trp Gln Gln Asn Thr Thr Thr Thr Thr Thr Ser Ala Gly Thr
        525                 530                 535

GGG TCC ATC CCG CCA TGG CAA CAG CAG CAG GCG GCT GCC GCA GCT TCT 2044
Gly Ser Ile Pro Pro Trp Gln Gln Gln Gln Ala Ala Ala Ala Ala Ser
    540                 545                 550

CCA GGA GCC CCT CAG ATG CAA GGC AAC CCC ACT ATG GTG CCC CTG CCC 2092
Pro Gly Ala Pro Gln Met Gln Gly Asn Pro Thr Met Val Pro Leu Pro
555                 560                 565                 570

CCC GGG GTC CAG CCG CCT CTG CCG CCT GGG GCC CCT CCC CCT CCG CCC 2140
Pro Gly Val Gln Pro Pro Leu Pro Pro Gly Ala Pro Pro Pro Pro Pro
                575                 580                 585

CGT AGC ATC GAG TGT CTT CTT TGT CTT CTT TCT CTC CTC ACC CAA CTC 2188
Arg Ser Ile Glu Cys Leu Leu Cys Leu Leu Ser Leu Leu Thr Gln Leu
            590                 595                 600

CCT TTG CCT CTC CCC AAA CCG GGC CGC CAG GAT CCC TCC CCG CGG CGG 2236
Pro Leu Pro Leu Pro Lys Pro Gly Arg Gln Asp Pro Ser Pro Arg Arg
        605                 610                 615

CGA TGG CCC GAG CCA TGA GAGTGAGGAC TTTCCGCGCC CATTGGTGAC 2284
Arg Trp Pro Glu Pro
    620

CCTTCCAGGC AGACAGCCTC AGCAACGCCC CTGGTGGACA GGATGGTTCG GCAAAGCAGC 2344
CTGAGTTATT TTTGTGGACG GAATCGGAAC ACGCTGGCTC CATATCGTGA AATTTTTATT 2404
AATTTTTTTC TTTTTCCTTT GTTACTTCTT TATCTTTTCC TTTCTTCAGA CTCCGTCCAA 2464
GGAGATGCTC TCCCCGGTCT TCTGCTGCAA TTTAGATTCC TTTGGGTTCT CTCCAGTTCT 2524
CCTTCCCTTA CCAAGGAGAG GGGAGCAAAT GGTTTTGGGC AAGGGCTTTG GCCATTCATG 2584
TCAAGCTGGT TGTGGGTTTT TCAAGGTGCC ATAGCCACCC CCAAATATGT TTGTTTAAAG 2644
CGTGGGGTTT TTTAATCTCT GCCACCCTTG TCAAGGGAGT CTTGTAAAGT TGCCGAGGGT 2704
AGGTTCATCT CCAGGTTTCG GGATTCCCAT CCGTCCTGGC GATCCTGCCA GCAGTGGGTG 2764
GGCAGCCTGA GCTCCCTCGG GCTCGCCTGC CAGCCTGGAG TTCTTCCTGT GCTCCTTGAT 2824
CACCTGAGCT GCCTCAGATT CCATTTGGTC CTCTCCTTCC TGGAAGGCTT CCTTTTATGT 2884
TTTGTTTTAA TCCCAAATGT CTGAATGTTT TGCAGTGTGT AGGGGTTTGA GCCCCTTGTT 2944
CATTCTCCTT CCTTTTTCCT CCCGCTTCCC TCTCCATGAA GTGATTCTGT TGACAATAAT 3004
GTATACTGCG CGTTCTCTTC ACTGGTTTAT CTGCAGAAAT TTCTCTGGGC TTTTTTCGGT 3064
GTTAGATTCA ACACTGCGCT AAAGCGGGGA TGTTCCATTG AATAAAAGAG CAGTGTGGTT 3124
TTCTGGGAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3184
AAAAAAAAAA AAAAAA 3200

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 623 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:



EP 0 727 486 A2 



Met Ala Thr Gly Ala Asn Ala Thr Pro Leu Asp Phe Pro Ser Lys Lys
1               5                   10                  15

Arg Lys Arg Ser Arg Trp Asn Gln Asp Thr Met Glu Gln Pro Thr Val
            20                  25                  30

Ile Pro Gly Met Pro Thr Val Ile Pro Pro Gly Leu Thr Arg Glu Gln
        35                  40                  45

Glu Arg Ala Tyr Ile Val Gln Leu Gln Ile Glu Asp Leu Thr Arg Lys
    50                  55                  60

Leu Arg Thr Gly Asp Leu Gly Ile Pro Pro Asn Pro Glu Asp Arg Ser
65                  70                  75                  80

Pro Ser Pro Glu Pro Ile Tyr Asn Ser Glu Gly Lys Arg Leu Asn Thr
                85                  90                  95

Arg Glu Phe Arg Thr Arg Lys Lys Leu Glu Glu Glu Arg His Asn Leu
            100                 105                 110

Ile Thr Glu Met Val Ala Leu Asn Pro Asp Phe Lys Pro Pro Ala Asp
        115                 120                 125

Tyr Lys Pro Pro Ala Thr Arg Val Ser Asp Lys Val Met Ile Pro Gln
    130                 135                 140

Asp Glu Tyr Pro Glu Ile Asn Phe Val Gly Leu Leu Ile Gly Pro Arg
145                 150                 155                 160

Gly Asn Thr Leu Lys Asn Ile Glu Lys Glu Cys Asn Ala Lys Ile Met
                165                 170                 175

Ile Arg Gly Lys Gly Ser Val Lys Glu Gly Lys Val Gly Arg Lys Asp
            180                 185                 190

Gly Gln Met Leu Pro Gly Glu Asp Glu Pro Leu His Ala Leu Val Thr
        195                 200                 205

Ala Asn Thr Met Glu Asn Val Lys Lys Ala Val Glu Gln Ile Arg Asn
    210                 215                 220

Ile Leu Lys Gln Gly Ile Glu Thr Pro Glu Asp Gln Asn Asp Leu Arg
225                 230                 235                 240

Lys Met Gln Leu Arg Glu Leu Ala Arg Leu Asn Gly Thr Leu Arg Glu
                245                 250                 255

Asp Asp Asn Arg Ile Leu Arg Pro Trp Gln Ser Ser Gly Thr Arg Ser
            260                 265                 270

Ile Thr Asn Thr Thr Val Cys Thr Lys Cys Gly Gly Ala Gly His Ile
        275                 280                 285

Ala Ser Asp Cys Lys Phe Gln Arg Pro Gly Asp Pro Gln Ser Ala Gln
    290                 295                 300

Asp Lys Ala Arg Met Asp Lys Glu Tyr Leu Ser Leu Met Ala Glu Leu
305                 310                 315                 320

Gly Glu Ala Pro Val Pro Ala Ser Val Gly Ser Thr Ser Gly Pro Ala
                325                 330                 335

Thr Thr Pro Leu Ala Ser Ala Pro Arg Pro Ala Ala Pro Ala Asn Asn
            340                 345                 350

Pro Pro Pro Pro Ser Leu Met Ser Thr Thr Gln Ser Arg Pro Pro Trp
        355                 360                 365

Met Asn Ser Gly Pro Ser Glu Ser Trp Pro Tyr His Gly Met His Gly
    370                 375                 380

Gly Gly Pro Gly Gly Pro Gly Gly Gly Pro His Ser Phe Pro His Pro
385                 390                 395                 400

Leu Pro Ser Leu Thr Gly Gly His Gly Gly His Pro Met Gln His Asn
                405                 410                 415

Pro Asn Gly Pro Pro Pro Pro Trp Met Gln Pro Pro Pro Pro Pro Met
            420                 425                 430

Asn Gln Gly Pro His Pro Pro Gly His His Gly Pro Pro Pro Met Asp
        435                 440                 445

Gln Tyr Leu Gly Ser Thr Pro Val Gly Ser Gly Val Tyr Arg Leu His
    450                 455                 460

Gln Gly Lys Gly Met Met Pro Pro Pro Pro Met Gly Met Met Pro Pro
465                 470                 475                 480

Pro Pro Pro Pro Pro Ser Gly Gln Pro Pro Pro Pro Pro Ser Gly Pro
                485                 490                 495

Leu Pro Pro Trp Gln Gln Gln Gln Gln Gln Pro Pro Pro Pro Pro Pro
            500                 505                 510

Pro Ser Ser Ser Met Ala Ser Ser Thr Pro Leu Pro Trp Gln Gln Asn
        515                 520                 525

Thr Thr Thr Thr Thr Thr Ser Ala Gly Thr Gly Ser Ile Pro Pro Trp
    530                 535                 540

Gln Gln Gln Gln Ala Ala Ala Ala Ala Ser Pro Gly Ala Pro Gln Met
545                 550                 555                 560

Gln Gly Asn Pro Thr Met Val Pro Leu Pro Pro Gly Val Gln Pro Pro
                565                 570                 575

Leu Pro Pro Gly Ala Pro Pro Pro Pro Pro Arg Ser Ile Glu Cys Leu
            580                 585                 590

Leu Cys Leu Leu Ser Leu Leu Thr Gln Leu Pro Leu Pro Leu Pro Lys
        595                 600                 605

Pro Gly Arg Gln Asp Pro Ser Pro Arg Arg Arg Trp Pro Glu Pro
    610                 615                 620
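As a consistency check between SEQ ID NO:1 and SEQ ID NO:2, codon rows can be translated with the standard genetic code. The Python sketch below is an illustration and not part of the filed listing. The names `translate`, `CODON_TABLE` and `THREE_LETTER` are hypothetical, and the standard code table is an assumption. The example input is the final stretch of the coding region as printed in SEQ ID NO:1.

```python
# Standard genetic code, indexed by bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate DNA codon by codon, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    residues = []
    for pos in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[pos:pos + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        residues.append(THREE_LETTER[aa])
    return " ".join(residues)


# Last codons of the coding region of SEQ ID NO:1, including the TGA stop.
tail = "CGA TGG CCC GAG CCA TGA"
print(translate(tail))
```

Run as written, this prints `Arg Trp Pro Glu Pro`, matching residues 619 to 623 of SEQ ID NO:2. The same function can be applied row by row to compare any codon line with the amino acid line printed beneath it.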



Claims 



A DNA comprising the full structure or a part of the DNA represented by SEQ ID NO:1, or a DNA [illegible] to the DNA comprising the full structure or a part of the
A transformant transformed by a plasmid having, integrated therein, the full structure or a part of the DNA represented by SEQ ID NO:1 which can be expressed therein.

An antibody against the polypeptide as set forth in Claim 2 as an antigen. 

A gene analysis method which comprises using, as a primer, a probe or a marker, a DNA comprising a part of the 
DNA represented by SEQ ID NO:1 and hybridizing the primer, the probe or the marker with a DNA to be tested. 
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